-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build global ordinals terms bucket from matching ordinals #30166
Conversation
The global ordinals terms aggregator has an option to remap global ordinals to dense ordinal that match the request. This mode is automatically picked when the terms aggregator is a child of another bucket aggregator or when it needs to defer buckets to an aggregation that is used in the ordering of the terms. Though when building the final buckets, this aggregator loops over all possible global ordinals rather than using the hash map that was built to remap the ordinals. For fields with high cardinality this is highly inefficient and can lead to slow responses even when the number of terms that match the query is low. This change fixes this performance issue by using the hash table of matching ordinals to perform the pruning of the final buckets for the terms and significant_terms aggregation. I ran a simple benchmark with 1M documents containing 0 to 10 keywords randomly selected among 1M unique terms. This field is used to perform a multi-level terms aggregation using rally to collect the response times. The aggregation below is an example of a two-level terms aggregation that was used to perform the benchmark: ``` "aggregations":{ "1":{ "terms":{ "field":"keyword" }, "aggregations":{ "2":{ "terms":{ "field":"keyword" } } } } } ``` | Levels of aggregation | 50th percentile ms (master) | 50th percentile ms (patch) | | --- | --- | --- | | 2 | 640.41ms | 577.499ms | | 3 | 2239.66ms | 600.154ms | | 4 | 14141.2ms | 703.512ms | Closes elastic#30117
Pinging @elastic/es-search-aggs |
@@ -103,11 +101,22 @@ public SignificantStringTerms buildAggregation(long owningBucketOrdinal) throws | |||
|
|||
BucketSignificancePriorityQueue<SignificantStringTerms.Bucket> ordered = new BucketSignificancePriorityQueue<>(size); | |||
SignificantStringTerms.Bucket spare = null; | |||
for (long globalTermOrd = 0; globalTermOrd < valueCount; ++globalTermOrd) { | |||
if (includeExclude != null && !acceptedGlobalOrdinals.get(globalTermOrd)) { | |||
boolean needsFullSan = bucketOrds == null || bucketCountThresholds.getMinDocCount() == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - "needsFullScan"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jimczi !
for (long globalTermOrd = 0; globalTermOrd < valueCount; ++globalTermOrd) { | ||
if (includeExclude != null && !acceptedGlobalOrdinals.get(globalTermOrd)) { | ||
boolean needsFullScan = bucketOrds == null || bucketCountThresholds.getMinDocCount() == 0; | ||
long maxId = needsFullScan ? valueCount : bucketOrds.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make both final
for (long globalTermOrd = 0; globalTermOrd < valueCount; ++globalTermOrd) { | ||
if (includeExclude != null && !acceptedGlobalOrdinals.get(globalTermOrd)) { | ||
boolean needsFullScan = bucketOrds == null || bucketCountThresholds.getMinDocCount() == 0; | ||
long maxId = needsFullScan ? valueCount : bucketOrds.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make them final?
Get globalOrd by |
We use |
thanks @jimczi. I read the code carefully and it's my fault. |
The global ordinals terms aggregator has an option to remap global ordinals to dense ordinal that match the request. This mode is automatically picked when the terms aggregator is a child of another bucket aggregator or when it needs to defer buckets to an aggregation that is used in the ordering of the terms. Though when building the final buckets, this aggregator loops over all possible global ordinals rather than using the hash map that was built to remap the ordinals. For fields with high cardinality this is highly inefficient and can lead to slow responses even when the number of terms that match the query is low. This change fixes this performance issue by using the hash table of matching ordinals to perform the pruning of the final buckets for the terms and significant_terms aggregation. I ran a simple benchmark with 1M documents containing 0 to 10 keywords randomly selected among 1M unique terms. This field is used to perform a multi-level terms aggregation using rally to collect the response times. The aggregation below is an example of a two-level terms aggregation that was used to perform the benchmark: ``` "aggregations":{ "1":{ "terms":{ "field":"keyword" }, "aggregations":{ "2":{ "terms":{ "field":"keyword" } } } } } ``` | Levels of aggregation | 50th percentile ms (master) | 50th percentile ms (patch) | | --- | --- | --- | | 2 | 640.41ms | 577.499ms | | 3 | 2239.66ms | 600.154ms | | 4 | 14141.2ms | 703.512ms | Closes #30117
* master: (7173 commits) Bump changelog version to 6.4 (elastic#30217) [DOCS] Adds native realm security settings (elastic#30186) Test: Switch painless test to 1 shard CCS: Drop http address from remote cluster info (elastic#29568) Reindex: Fold "from old" tests into reindex module (elastic#30142) Convert FieldCapabilitiesResponse to a ToXContentObject. (elastic#30182) [DOCS] Added 'on a single shard' to description of max_thread_count. Closes 28518 (elastic#29686) [TEST] Redirect links to new locations (elastic#30179) Move repository-s3 fixture tests to QA test project (elastic#29372) Fail snapshot operations early on repository corruption (elastic#30140) Docs: Document `failures` on reindex and friends Build global ordinals terms bucket from matching ordinals (elastic#30166) Watcher: Ensure mail message ids are unique per watch action (elastic#30112) REST: Remove GET support for clear cache indices (elastic#29525) SQL: Correct error message (elastic#30138) Require acknowledgement to start_trial license (elastic#30135) Fix a bug in FieldCapabilitiesRequest#equals and hashCode. (elastic#30181) SQL: Add BinaryMathProcessor to named writeables list (elastic#30127) Tests: Use buildDir as base for generated-resources (elastic#30191) Fix SliceBuilderTests#testRandom failures ...
Thank you @jimczi and @pakein! I came across this issue when investigating a performance issue for one of my customers aggregations and applied this patch to a forked terms aggregation in a plugin for ES Any chance of getting this in before |
It was too late for 6.3.0 and it's not a critical bug (though this is arguable ;) ) hence 6.4.0.
Nice numbers ! Though 20s for 3M docs seems quite high even with 4 levels. How many terms do you retrieve per level ? |
I discussed with @colings86 and we've decided to backport this pr in 6.3.1. It's really too late for 6.3.0 but if there is a 6.3.1 (which is not guaranteed) this pr will be included. |
@jimczi Thank you, For our use-case, the first 3-levels are 100, and the 4th is unbounded (essentially all doc ids for the bucket). Its this 4th level id collection that is the issue and I am working on trying to get them to remove this "requirement". |
The global ordinals terms aggregator has an option to remap global ordinals to dense ordinal that match the request. This mode is automatically picked when the terms aggregator is a child of another bucket aggregator or when it needs to defer buckets to an aggregation that is used in the ordering of the terms. Though when building the final buckets, this aggregator loops over all possible global ordinals rather than using the hash map that was built to remap the ordinals. For fields with high cardinality this is highly inefficient and can lead to slow responses even when the number of terms that match the query is low. This change fixes this performance issue by using the hash table of matching ordinals to perform the pruning of the final buckets for the terms and significant_terms aggregation. I ran a simple benchmark with 1M documents containing 0 to 10 keywords randomly selected among 1M unique terms. This field is used to perform a multi-level terms aggregation using rally to collect the response times. The aggregation below is an example of a two-level terms aggregation that was used to perform the benchmark: ``` "aggregations":{ "1":{ "terms":{ "field":"keyword" }, "aggregations":{ "2":{ "terms":{ "field":"keyword" } } } } } ``` | Levels of aggregation | 50th percentile ms (master) | 50th percentile ms (patch) | | --- | --- | --- | | 2 | 640.41ms | 577.499ms | | 3 | 2239.66ms | 600.154ms | | 4 | 14141.2ms | 703.512ms | Closes #30117
Do {the tag swap + 6.3 commit} mean anything in regards to the possibility of having this in 6.3.0? Did anything change? |
@fredgalvao yes this change will be in 6.3.0 |
The global ordinals terms aggregator has an option to remap global ordinals to
dense ordinal that match the request. This mode is automatically picked when the terms
aggregator is a child of another bucket aggregator or when it needs to defer buckets to an
aggregation that is used in the ordering of the terms.
Though when building the final buckets, this aggregator loops over all possible global ordinals
rather than using the hash map that was built to remap the ordinals.
For fields with high cardinality this is highly inefficient and can lead to slow responses even
when the number of terms that match the query is low.
This change fixes this performance issue by using the hash table of matching ordinals to perform
the pruning of the final buckets for the terms and significant_terms aggregation.
I ran a simple benchmark with 1M documents containing 0 to 10 keywords randomly selected among 1M unique terms.
This field is used to perform a multi-level terms aggregation using rally to collect the response times.
The aggregation below is an example of a two-level terms aggregation that was used to perform the benchmark:
Closes #30117